Meta Architecture Search

Neural Information Processing Systems

Neural Architecture Search (NAS) has been quite successful in constructing state-of-the-art models on a variety of tasks. Unfortunately, its computational cost can make it difficult to scale. In this paper, we make the first attempt to study Meta Architecture Search, which aims at learning a task-agnostic representation that can be used to speed up architecture search on a large number of tasks. We propose the Bayesian Meta Architecture SEarch (BASE) framework, which takes advantage of a Bayesian formulation of the architecture search problem to learn over an entire set of tasks simultaneously. We show that on ImageNet classification, we can find a model that achieves 25.7% top-1 error and 8.1% top-5 error by adapting the architecture in less than an hour from a meta-network pretrained for 8 GPU days. By learning a good prior for NAS, our method dramatically decreases the required computation cost while achieving performance comparable to current state-of-the-art methods, even finding competitive models for unseen datasets with very quick adaptation. We believe our framework will open up new possibilities for efficient and massively scalable architecture search research across multiple tasks.




Reviews: Meta Architecture Search

Neural Information Processing Systems

The authors propose Bayesian Meta Architecture Search (BASE), a method for meta-learning neural network architectures and their weights across tasks. The paper frames this problem as a Bayesian inference problem and employs the Gumbel-Softmax reparameterization and optimization embedding, a variational inference method, to optimize a distribution over neural network architectures and their weights across different tasks.

Originality: Meta-learning neural network architectures is a very natural next step for NAS research, which has not been done so far (at least I'm not aware of any work). It is not only very natural but also very important, as it makes NAS more scalable and of more practical relevance. The Bayesian view, however, is not really novel, but rather an obvious extension of [1]. In general, the related work section is very short and does not provide a proper summary of the current state of the art in this field of research.

Quality: BASE is well motivated and derived.
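For readers unfamiliar with the Gumbel-Softmax reparameterization mentioned in this review, the following is a minimal sketch of how a relaxed sample over candidate operations can be drawn. It is not the authors' implementation: the operation names, logits, and temperature are hypothetical values chosen for illustration, and the code uses only the Python standard library rather than a deep learning framework.

```python
import math
import random


def sample_gumbel(rng):
    """Draw Gumbel(0, 1) noise via the inverse transform -log(-log(U))."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature, rng):
    """Return a relaxed (soft) one-hot sample over the given logits."""
    # Perturb each logit with Gumbel noise, then scale by the temperature.
    noisy = [(logit + sample_gumbel(rng)) / temperature for logit in logits]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations for one edge of a searched cell.
CANDIDATE_OPS = ["conv3x3", "conv5x5", "max_pool", "skip"]
# Hypothetical architecture logits (in practice these would be learned).
logits = [0.5, 0.1, -0.3, 0.0]

rng = random.Random(0)
weights = gumbel_softmax(logits, temperature=0.5, rng=rng)
chosen = CANDIDATE_OPS[weights.index(max(weights))]
print(dict(zip(CANDIDATE_OPS, weights)), "->", chosen)
```

Lower temperatures push the sample toward a one-hot choice, while higher temperatures give a smoother mixture. In a gradient-based setting the same computation would be expressed with differentiable tensors, so that gradients can flow back to the logits.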


Reviews: Meta Architecture Search

Neural Information Processing Systems

This paper shows how to apply transfer learning from related tasks to NAS. It proposes a variational Bayesian formulation, which is related to earlier work, but the transfer learning approach is novel and clearly useful, provided that NAS practitioners can assemble a sufficient number of tasks over time. As detailed in the reviews, the experimental validation could be improved.



Meta Architecture Search

Shaw, Albert, Wei, Wei, Liu, Weiyang, Song, Le, Dai, Bo
